A Better Benchmark Assessment 






Running Head: A BETTER BENCHMARK ASSESSMENT 



A Better Benchmark Assessment: 
Multiple-Choice Versus Project-Based
Jamon F. Peariso 

University of La Verne Master's Thesis
Education 596: Graduate Seminar 



Summer 2006 




Abstract 

The purpose of this literature review and Ex Post Facto 
descriptive study was to determine which type of benchmark 
assessment, multiple-choice or project-based, provides the best
indication of general success on the history portion of the CST. 
The result of the study indicates that although the project- 
based benchmark assessment was better than the multiple-choice 
benchmark assessment at predicting the student acquisition of 
the desired "proficient" or above level on the CST, the data was 
inconclusive. However, the study did reveal that both types of 
benchmark assessments were successful in predicting acquisition 
of the "basic" or above level on the CST. 







Table of Contents 
CHAPTER

I. THE PROBLEM
    Introduction
    Statement of the Problem
    Statement of Purpose
    Importance of the Study
    Setting
    Definition of Terms

II. LITERATURE REVIEW
    Mandated Testing and the NCLB Act
        History
        "High Stakes" Testing
        No Child Left Behind Act
    Benchmark Assessments
        History
        Positives and Negatives
    Multiple-Choice Assessments
        History
        Positives and Negatives
    Project-Based Assessments
        History
        Positives and Negatives
    Benchmarks: The ACES

III. METHODOLOGY
    Description of the Research
    Research Design
    Selection of Subjects
    Data Gathering
    Data Analysis
    Limitations

IV. DATA PRESENTATIONS AND ANALYSIS
    Presentation of Data
    Analysis of Results
    Summary

V. CONCLUSIONS AND RECOMMENDATIONS
    Conclusions
    Implications and Inferences
    Recommendations for Further Study

REFERENCES

APPENDIXES
    Appendix A: Project-Based ACES
    Appendix B: Multiple-Choice Based ACES

FIGURES

Figure 1: 2005 DJUHSD ACES Passing Percentages
Figure 2: 2005 DJUHSD CST Proficiency Percentages
Figure 3: Passed ACES to "Proficient" CST Comparisons
Figure 4: Passed ACES to "Basic" CST Comparisons





CHAPTER I: THE PROBLEM
Introduction 

With the passage of the No Child Left Behind Act (NCLB),
high-stakes standardized testing has become the rule throughout
the United States. School districts and their administrators are
scrambling to find ways to motivate students to perform
well. They are also requiring teachers to make these tests the
focal point of their lesson planning and instruction through
alignment and integration of the state content standards (Louis,
Febey, & Schroeder, 2005, p. 177).

The next logical step these districts have taken has
been to assess students' progress toward mastery of these state
content standards, and to hold teachers responsible for teaching
these standards, so that students perform well on the mandated
tests (Olson, 2005a).

What this process has created is a system of evaluating
students throughout the course of their instruction, called
benchmark assessment or benchmark testing. The objectives of
benchmark testing are generally threefold: first, to assess
students' progress toward mastery of the content standards on
which the mandated tests are based; second, to provide a minimum
level of mastery and accountability for students before they are
able to progress to the next class or grade level; and third, to
give the teacher a functional assessment tool to gauge students'
mastery of what was taught (Herman & Baker, 2005; Swanson &
Collins, 1999).

These benchmark assessments come in a variety of forms
(multiple-choice, constructed response, or project-based). They
are either developed by the educational industry and then
purchased by a district, or developed through cohort
collaboration within a district (Darling-Hammond, 1994; Shepard,
1995).

Statement of the Problem 

One facet of the benchmark assessment issue is that the
mandated standardized tests are primarily multiple-choice. Many
educators believe that multiple-choice assessments, in general,
only assess a student at a basic cognitive level (Burton, 2005;
Simkin & Kuechler, 2005). These educators see multiple-choice
benchmarks similarly, as teaching and assessing content
standards at the same basic cognitive level. Educators also see
such assessments as "teaching to the test," which has become a
negative phrase that connotes a lack of affective learning
(Burton, 2005; Popham, 2004).

The other facet of this issue is that this benchmarking
system has created an era in which school districts and
administrators implement and mandate testing and curriculum that
teach to the content of the mandated tests. The result has drawn
many complaints from educators and researchers alike. Their
complaints are not with accountability, but rather with what the
above scenario has caused many teachers to do: leave behind
assessments that gauge students at a deeper cognitive level and
reach into the affective domain (McNeil, 2000; Olson, 2005b;
Sheldon & Biddle, 1998; Shepard, 1989).

The question this raises is which type of general benchmark
assessment, multiple-choice (which is generally assumed to
assess the cognitive domain) or project-based (which is
generally assumed to assess the affective domain), is more
effective in assessing students' knowledge of the content
standards and is therefore a relevant indicator of success on
the California Standards Test (CST) portions of the Standardized
Testing and Reporting (STAR) Tests.

Statement of Purpose 

The purpose of this study, through a literature review and
an Ex Post Facto descriptive study, is to determine which type of
benchmark assessment, multiple-choice or project-based, provides
the best indication of successful acquisition of the California
History-Social Science Content Standards, and therefore of
general success on the history portion of the CST.







Importance of the Study 

In an age of benchmark testing, high-stakes tests, and the
NCLB Act, this study can assist districts and educators in
creating a more productive benchmark assessment to gauge
students' knowledge of content standards and, therefore, a tool
that can be used to properly predict success on the history
portion of the CST.

Setting 

The study was conducted in the Delano Joint Union High
School District (DJUHSD) in Delano, California. DJUHSD contains
two comprehensive high schools (Delano High School and Cesar E.
Chavez High School) with a population of roughly 2,000 students
at each campus. The ethnic demographics of DJUHSD in the
2004-2005 school year were 81% Hispanic, 13% Filipino, 3% White,
2% African American, 1% Asian, and less than 0.1% other. Of all
students, 61% were English Language Learners and 79% were
socioeconomically disadvantaged.

This study focuses on all students enrolled in all levels
(Sheltered, High School, College Prep., and Honors) of World
History (10th-grade year) and United States History (11th-grade
year). A.P. United States History classes are not included
in this study because A.P. classes are exempt from the
district-mandated benchmark tests (ACES).







Definition of Terms 
Accountability 

Accountability is the obligation of teachers and other
school personnel to accept responsibility for students'
performance on high-stakes assessments, often mandated by
policymakers calling for school reform.

Assessment 

Assessment is any process that measures student
learning or abilities. It can be conducted formally or
informally and can take forms including, but not limited to,
tests, essays, projects, oral presentations, and portfolio
projects.

Assessments of Core Exit Standards (ACES) 

The ACES are the benchmark assessments used by the DJUHSD.
They are developed through district-wide cohort collaboration
and generally come in two forms: multiple-choice and
project-based.

Authentic Learning/Assessment 
Authentic learning/assessment is any form of assessment
in which tasks are set in a meaningful context that provides
connections between real-world experiences and school-based
ideas. At times, authentic assessment is used interchangeably
with project-based and performance-based learning/assessment,
even though each has a slightly different focus.







Benchmark Assessments 

Benchmark assessments are standards-based assessments,
administered at regular intervals, that are used to determine
student growth and student performance relative to statewide
grade-level achievement expectations.

California History-Social Science Content Standards 
These are the academic content standards for grades ten through
twelve, adopted by the California State Board of Education.

California Standards Tests (CST) 

The California Standards Tests (CST), which are a portion of
the STAR Test, show how well students are doing in relation to
the state content standards. Student scores are reported as
performance levels. The five performance levels are "advanced"
(exceeds state standards), "proficient" (meets state standards),
"basic" (approaching state standards), "below basic" (below
state standards), and "far below basic" (well below state
standards). Students scoring at the "proficient" or "advanced"
level have met state standards in that content area.

High-Stakes Testing 

High-stakes testing is the practice of using students'
performance on a single assessment to make major decisions about
students or school personnel.




No Child Left Behind Act of 2001 (NCLB) 

NCLB (Public Law 107-110) is a United States federal law
that reauthorizes a number of federal programs whose aim is to
improve the performance of America’s primary and secondary
schools by increasing the standards of accountability for
states, school districts, and schools, as well as providing
parents more flexibility in choosing which schools their
children will attend. Additionally, it promotes an increased
focus on reading and reauthorizes the Elementary and Secondary
Education Act of 1965 (ESEA).

Performance-Based Learning/Assessment 
Performance-based learning is a term commonly used
interchangeably with authentic learning/assessment, the
slight difference being that performance-based assessment
focuses on assessing tasks a student can do.

Project-Based Learning/Assessment
Project-based learning is an approach that intends to bring
about deep learning by allowing learners to encounter
problem-solving opportunities or research in the context of a
complex, open-ended project.

Standardized Testing and Reporting (STAR) Test 
The STAR Test is the State of California's mandated achievement
test that assesses students' knowledge according to the
California Content Standards for the relevant grade level in the
core fields of language arts, mathematics, science, and social
science through the individual California Standards Tests (CST).

Traditional Assessment 

A traditional assessment is a test-based (pencil-and-paper)
form of assessment. It can be formal or informal and may include
multiple-choice, true/false, or fill-in-the-blank questions.







CHAPTER II: LITERATURE REVIEW 
Mandated Testing and the NCLB Act 
History 

The road to government-mandated testing began
in the early 1900s with compulsory elementary school
attendance laws. Prior to these laws, educational assessment and
testing in the United States were generally for the purpose of
assessing whether students could apply their knowledge to a
specific task. These assessments concentrated on a syllabus,
curriculum, or craft (Madaus, 1993).

The elementary attendance laws were designed to guarantee
the education of an increasingly large and ethnically diverse
population of students. The focus of these laws was on
efficiency in education and on meeting the requirements of
growing industrialization. These attendance laws also created an
assembly-line method of organizing schools: a linear progression
of grades and a standard curriculum (Stiggins, 1991).

Generally, this new system of schools was used to weed
out the students headed to the assembly line from the
college-bound students. Because of this weeding-out process,
assessments needed to be able to detect individual differences
in achievement among students (Stiggins, 1991).







The successful World War I era Army Alpha standardized
aptitude test became the wave of the future in education. The
most influential characteristic of the educational field's shift
toward the Army's standardized testing methods was the
separation of duties: assessment and instruction became
separated. This led to school-wide, district-wide, state-wide,
and nation-wide testing programs layered on top of each other,
systems that are still in use today (Stiggins, 1991).

From the college admission tests of the late 1930s to the
explosion of standardized tests in the 1950s, published
standardized tests proliferated and began to be used as
accountability tools for the first time, though on a small scale
(Clarke, Madaus, Horn, & Ramos, 2000; Stiggins, 1991).

"High Stakes" Testing 

The idea of "high stakes," holding students, districts, and
states accountable, was introduced in the 1970s:

American educators were inundated with legislative
requirements for testing that were part of the "educational
accountability movement." State legislators, dismayed by
what they believed to be ineffectual public schooling,
mandated that a variety of obligatory tests be established
to show whether students could display at least minimal
competence in the three R's. Sometimes, a student's receipt
of a high school diploma or promotion to the next grade was
linked to performance on these competency tests (Popham,
1993, p. 471).

These mandated tests led to an unanticipated use: comparing
the test scores of school districts within a state. The results
of the tests were then used to hold students back or to indicate
educators' effectiveness (Popham, 1993; Stiggins, 1991).

Popham (1993) goes on to explain: 

But even the architects of the numerous new statewide
testing programs failed to recognize the profound impact
that these high-stakes tests would have on instruction.
Because teachers wanted to make sure that their students
would be promoted (or, in some instances, because they
feared the wrath of parents if students were not promoted),
they began to emphasize in their instruction the knowledge
and skills that were being tested. Because administrators
wanted their schools (or districts) to look good when local
newspapers published test results, they encouraged teachers
to give ample instructional attention to the content to be
tested. The pressures to boost test scores became
pervasive (p. 471).







In 1983, A Nation at Risk, a publication by the
National Commission on Excellence in Education, made
recommendations that were quickly picked up by the media. Many
educational reformers and civil rights advocates pushed for
these recommendations, which caused a new wave of educational
reforms that demanded even greater accountability and
effectiveness in education (Louis et al., 2005; Melograno, 1994;
Stiggins, 1991).

The No Child Left Behind Act 

The next reform movement that added to the past fifty years
of test-based education was the No Child Left Behind Act of 2001
(NCLB), passed with bipartisan support. NCLB is a
reauthorization of the Elementary and Secondary Education Act of
1965 (ESEA), but instead of applying only to schools receiving
Title I funding, NCLB applies to all public schools (Linn &
Miller, 2005).

NCLB "...mandates that all states establish challenging 
academic content standards in academic subjects that: specify 
what children are expected to know and be able to do; contain 
coherent and rigorous content; and encourage the teaching of 
advanced skills" (Ormrod, 2006, p. 591). School districts must
then assess students annually to determine whether "adequate
yearly progress" (AYP) is being made by every student, including
all racial and socioeconomic groups (Ormrod).







The goal of NCLB is for 100% of students to reach
the "proficient" level or higher by 2014 (Linn & Miller, 2005).
If a school's AYP targets are not being met, then the "...school
will be identified as 'needs improvement' and be subject to the
sanctions that apply to schools so designated" (Linn & Miller,
p. 10).

Many supporters of NCLB, state standards, and
accountability insist that (if correctly incorporated) schools
with a focused, common curriculum and feedback have a better
chance of promoting student learning (Porter, 2000). There is
also research (Carnoy & Loeb, 2002) indicating that strong
accountability programs exhibit greater student achievement
gains.

On the other side of the issue, many educators agree
that NCLB-driven standardized testing reduces students'
creativity and genius, which cannot be developed in the
atmosphere of criticism, judgment, or evaluation that these
types of assessments create (Amabile, 1979; Armstrong, 1998;
Krippner, 1967).

Other critics (Linn & Miller, 2005; Stecher & Hamilton,
2002) agree that the increased over-reliance on results from
high-stakes tests further distorts education by causing important
objectives to be ignored when they are not included in the
standards and tests that are counted. Additionally, these
opponents claim that the increased scores are misleading because
teachers often teach to the specifics of the test rather than
the more general content standards (Linn & Miller; Stecher &
Hamilton).

This issue of teaching to the test raises concerns among
many educators about the reliability of high-stakes test
results (Koretz, 2005). It is especially concerning to educators
because research (Camel & Chung, 2002; Koretz; Shepard,
1989) confirms that teaching to the test works to improve
scores. Additionally, many teachers under the pressure of such
tests have changed their instructional practices to improve
students' ability to perform well through test-taking strategies
(Shepard; Vogler, 2005).

Another concerning effect of NCLB is that many states
and their districts have negotiated their compliance, or lack
thereof, with the Department of Education by seeking waivers to
specific provisions, while some states have even threatened
outright rebellion. The Department of Education has also
negotiated changes for individual states, which have added
complications to the whole process and to the interpretation of
the law (Sunderman, 2006).

Despite the pros and cons and the diversity of thought
regarding mandated testing and NCLB, many researchers
and educators are working on potential solutions and systems to
sift out the negative aspects of mandated testing and keep the
positive ones (Furger, 2002; McElroy, 2006; Olson, 2005b; O'Shea,
2006).

Benchmark Assessments 
History 

Many school districts' solution to the NCLB-mandated
high-stakes tests has been to develop a system called benchmark
testing, also known as progress monitoring systems or formative
assessments (Herman & Baker, 2005). Benchmark assessments
typically "are given periodically, from three times a year to as
often as once a month; focused on reading and mathematics skills
[or other core subjects], taking about an hour per subject;
reflecting state or district academic-content standards; and
measures students' progress through the curriculum and/or on
material in state exams" (Olson, 2005a, p. 14).

Benchmarking is the wave of the present and future: 70%
of superintendents surveyed in 2005 said they give periodic
district-wide benchmark tests, and another 10% said they were
planning to do so in the coming year (Olson, 2005a).

Because of this large movement toward benchmark assessments,
the assessment industry has jumped into this burgeoning market.
Market research has indicated that benchmark assessments are one
of two high-growth areas in the industry, alongside
state-mandated exams (Olson, 2005a). Predictions indicate that
by 2006 the benchmark assessment industry will generate $323
million in annual revenues for vendors (Olson).

Positives and Negatives 

Research has shown that, to make an effective benchmark
assessment, the content standards should be assessed properly
through effective alignment (Rothman, Slattery, Vranek, &
Resnick, 2002). The content standards should then be efficiently
prioritized: "Only those content standards determined to be of
the highest priority and also measurable on a per-standard basis
should be tested via large-scale assessments. The remaining
standards should serve as targets for teachers' instruction and
should be measured by classroom assessments" (Popham, 2000, p.
30).

Other researchers (Neill, 2006; Olson, 2005a) have further
concluded that benchmark assessments used formatively rather
than summatively can have a powerful impact on students'
achievement, especially for low-achieving students.

Some research relevant to the development of benchmark
assessments has also concluded that "...students who do well on
one set of standardized tests do not perform as well on other
measures of the same content, suggesting that they have not
acquired a deep understanding" (Olson, 2005a, p. 14). Therefore,
good benchmark assessments should measure performance on the
entire curriculum at a deep level of understanding (Olson).

Herman and Baker (2005) propose six criteria that
determine the validity of good benchmark tests: alignment
to content standards, enhanced diagnostic value of assessment
results through initial item and test structure design, fairness
for all students including English language learners and
students with disabilities, data showing technical quality,
built-in utility, and feasibility.

On the other side of the benchmarking issue, researchers
have found some negative consequences of benchmark assessments
in many school districts. For example, benchmarking "leads to
increased grade retention, which has repeatedly been proven to
be counterproductive in terms of its effects on students"
(Neill, 2006, p. 10). Another consequence of benchmarking is
schools teaching to the test: "The higher the stakes on the
examination, the more schools focus instruction on the tests
themselves. Whole subjects, such as science, social studies,
art, or physical education, may be reduced or eliminated if only
the areas of language arts and mathematics are going to be
tested" (Neill, pp. 10-11).







Some educators insist that there is already too much
testing and not enough instruction going on, with benchmarks
becoming another mandated test thrown onto the
assessment pile (McElroy, 2006; Olson, 2005b).

Other educators even claim that benchmark systems, which
contain the content standards, frameworks, and aligned
curriculum, still lack tools for teachers, which has caused
teachers to grow increasingly cynical and impatient
with the required output of high test scores, better grades, and
passing scores on the state or graduation exams (Olson, 2005b;
O'Shea, 2006).

With benchmark assessments here to stay (Olson, 2005a), the
question now is which type of benchmark assessment would be the
most beneficial in fulfilling all of the necessary criteria of a
good benchmark assessment. For the purpose of this study, the
possible benchmark assessment types fall into two general
categories: multiple-choice assessments or project-based
assessments.

Multiple-Choice Assessments
History 

Well after the end of World War I, essay and oral
examinations were still the normal form of assessment in the
United States. The Army Alpha examination was one of the first
large-scale multiple-choice tests, developed and used during
World War I to assess the aptitudes of nearly two million men
for selection and placement in the military. The Army
found a successful way to efficiently assess a large number of
recruits, which ended up changing the nature of assessment in
the education field within the United States (Clarke et al.,
2000; Madaus, 1993).

After World War I, the education field and the newly
developed test-publishing industry produced a number of
achievement tests patterned after the Army Alpha's
multiple-choice model. These tests could be given anywhere and
did not require students to construct responses that would be
costly and time-consuming to administer and grade on a large
scale. There was also ample evidence at the time that
performance on multiple-choice tests correlated well with
performance on constructed response tests. For the next half
century the multiple-choice assessment strategy was the norm,
especially after many states began to require state-mandated
minimum competency testing (Clarke et al., 2000; Madaus, 1993;
Popham, 1993; Stiggins, 1991).

Positives and Negatives 

Many educators and researchers have argued that
multiple-choice assessments have their limitations. First,
multiple-choice assessments tend to measure "whether the student
knows or understands what to do when confronted with a problem
situation, but it cannot determine how a student actually will
perform in that situation" (Linn & Miller, 2005, p. 196);
second, a multiple-choice item "requires selection of the
correct answer, and therefore it is not well adapted to
measuring some problem-solving skills... or to measure the
ability to organize or present ideas" (Linn & Miller, p. 196);
and third, there is the "difficulty of finding a sufficient
number of incorrect but plausible distracters" (Linn & Miller,
p. 196).

Other educators and researchers (Bridgeman, 1992; Carey,
1997; Lukhele, Thissen, & Wainer, 1994; Truckman, 1993) agree
with Linn and Miller, concluding that multiple-choice
assessments deny a student the ability to organize, synthesize,
argue coherently, express knowledge in personal terms, and
demonstrate creativity, all of which a simple constructed
response assessment could allow. They have further found that
multiple-choice assessments discourage critical thinking,
fail to attract students to science and industry, and can lead
students to view a course as a "numbers game," becoming more
concerned with the testing process than with the actual content.

Other researchers have also found that while constructed
response assessments develop concept learning, multiple-choice
assessments are generally limited to promoting the memorization
of details (Martinez, 1999; Traub & MacRury, 1990).
Multiple-choice assessments have also been found to possess
gender and racial biases (Bell & Hay, 1987; Bolger & Kellaghan,
1990; Lunsden & Scott, 1987).

Despite all the research against multiple-choice
assessments, there is evidence that multiple-choice assessments,
if constructed correctly, can be just as effective as
constructed response assessments (Burton, 2005; Simkin &
Kuechler, 2005).

Many educators also conclude that multiple-choice
assessments can do more than measure simple
learning outcomes. They can assess a student's knowledge of
terminology, specific facts (who, what, when, and where),
principles, and methods and procedures, as well as the ability
to identify applications of facts and principles, the ability to
interpret cause-and-effect relationships, and the ability to
justify methods and procedures (Linn & Miller, 2005, pp.
187-194).

In addition, well-constructed "multiple-choice type items
will tend to be of a higher quality than short-answer,
true-false, or matching-choice items in the same area" (Linn &
Miller, 2005, p. 196).

In summary, research comparing multiple-choice with constructed
response, or other types of non-traditional assessments, has
been found to be generally inconclusive. Each form of assessment
has its own pros and cons (Martinez, 1999; Simkin & Kuechler,
2005).

Project-Based Assessments 
History 

Several types of non-traditional assessments, namely
project-based, authentic, and performance assessments, share
the goal of assessing what a student can do by applying
knowledge and skills to complex tasks inside or outside the
classroom. These non-traditional assessments have gained
increased popularity among educators today (Darling-Hammond,
1991; Lester, Lambdin, & Preston, 1997; Paris & Paris, 2001;
Valencia, Hiebert, & Afflerbach, 1994).

Project-based learning can be traced back as far as the
early 1900s, when the noted American philosopher and educator
John Dewey supported "learning by doing." This idea is also
reflected in the educational theory of constructivism, which
"...explains that individuals construct knowledge through
interactions with their environment, and each individual's
knowledge construction is different. So, through conducting
investigations, conversation or activities, an individual is
learning by constructing new knowledge by building on their
current knowledge" (Grant, 2002, p. 2).







Many researchers and educators believe that it is
imperative that "teachers consider what [their] students should
be able to do when they join the real world, and [their]
assessment practices must, to some extent, reflect those
real-life tasks" (Ormrod, 2006, p. 526). The reasoning behind
the movement toward these types of non-traditional assessments
is that "the too frequent discontinuity between what occurs in
the classroom and what students must do beyond provides [the]
primary rationale ..." (Tanner, 2001, p. 25).

Positives and Negatives 

Real-life experiences are just one aspect of the use of
performance-based or project-based assessments. These
non-traditional assessments are of greater use in promoting
students' learning and achievement where multiple-choice
assessments are limited (Roberts & Harlin, 2005).

These non-traditional assessments are able to facilitate
the following practices, which research has found to be the most
effective in promoting students' learning and achievement: "Give
a formal or informal pretest to determine where to begin
instruction, choose or develop an assessment instrument that
reflects the actual knowledge and skills a student should
achieve, construct assessment instruments that reflect how
students should process information when they study, use an
assessment task as a learning experience in and of itself, use
an assessment to give students specific feedback about what they
have and have not mastered, and provide criteria that students
can use to evaluate their own performance" (Ormrod, 2006, p.
528).

There is also plenty of research indicating that
non-traditional assessments, such as project-based
assessments, result in significantly higher test scores, passing
rates, student engagement, knowledge retention, and classroom
attendance (Bartscher, Gould, & Nutter, 1995; Ferretti,
Macarthur, & Okolo, 2001; Mehta & Kou, 2005; Railsback, 2002).

Project-based or performance-based assessments also have
many advantages, such as "...clear communication of
instructional goals that involve complex performances in natural
settings in and outside of school, measure complex learning
outcomes that cannot be measured by other means, provides a
means of assessing process or procedure as well as the product
that results from performing a task, and implementation of
modern learning theory approaches that reach students at an
affective level" (Linn & Miller, 2005, p. 257).

Despite all the advantages of these types of assessments,
there are limitations. The most common limitation is the
unreliability of ratings of performances across teachers or
across time for the same teacher. Another limitation is that
performance-based or project-based assessments are time-consuming.
Students need ample time to perform each task, which could limit
the amount of curriculum covered (Burstein, 1994; Linn & Miller,
2005).

Shepard (1995) also notes that the implementation of non-
traditional assessments into high-stakes testing systems could
result in the same issues that critics raise about multiple-choice
based high-stakes tests. Shepard comments that "even
authentic measurements are corruptible and when practiced for,
can distort curriculum and undermine professional autonomy" (p.
38).

Darling-Hammond (1994) further agrees with Shepard (1995)
and adds that "alternative assessment methods, such as 
performance-based assessment, are not inherently equitable, and 
that educators must pay careful attention to the ways that the 
assessments are used" (p. 5). Darling-Hammond then argues that 
"the equitable use of performance assessments depends not only 
on the design of the assessments themselves, but also on how 
well the assessment practices are interwoven with the goals of 
authentic school reform and effective teaching" (p. 5). 

Benchmarks: The ACES 

The mandated standardized testing required by NCLB, which
caused the development and use of benchmark assessments, and the
debate over multiple-choice versus authentic project-based
assessments, resulted in a benchmark assessment program that was
adopted by the Delano Joint Union High School District (DJUHSD).

DJUHSD has a series of benchmark assessments called the
Assessment of Core Exit Standards (ACES), which are designed
through cohort collaboration within the district. James Hay,
Director of Support and Assessment Services (personal
communication, May 10, 2006), stresses that the ACES system is
broader than just a benchmark assessment. The ACES are
a whole curriculum aligned to the California Content Standards,
for which a scope and sequence is developed that the teacher is
to follow. The content of the actual ACES assessments is
aligned to the scope and sequence.

Hay (personal communication, May 10, 2006) also noted that
the ACES serve three main functions: to assess students'
progress toward mastery of the content standards, to provide a
minimum level of mastery and accountability for the student
before they are able to pass the particular class, and to hold
the teacher accountable to teach the California Content
Standards.

Hay (personal communication, May 10, 2006) further noted
that there are four ACES given per semester. A student has
several chances to pass all four by obtaining a score of 70% or
more. At the beginning of the semester, the students are given
the final (all four ACES combined) as a pretest. This pretest
does not count toward a student's fulfillment of the
requirement. Throughout the semester, each ACES is given after
the teacher covers the associated instructional unit.

Hay (personal communication, May 10, 2006) also said that
if a student does not pass an ACES on the first try, they are
allowed one retake within two weeks after going to a tutorial
provided by each department. If a student still does not pass an
ACES, they can still pass the class by receiving a score of 70%
or higher on the final, which is all four ACES combined.
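
The passing rules described above can be restated as a short
sketch. It is an illustration only, not district software: the
function names, their parameters, and the way scores are stored
are assumptions made for this example. Only the 70% threshold,
the single retake, and the combined final come from Hay's
description.

```python
from typing import Optional, Sequence, Tuple

PASSING_SCORE = 70.0  # percent; the threshold reported by Hay

# Hypothetical representation: one (first_try, retake) pair per ACES.
# The retake is None when the student did not take one.
Attempt = Tuple[float, Optional[float]]


def passed_single_aces(first_try: float, retake: Optional[float]) -> bool:
    """True if the first try or the one allowed retake reaches 70%."""
    if first_try >= PASSING_SCORE:
        return True
    return retake is not None and retake >= PASSING_SCORE


def passes_aces_requirement(attempts: Sequence[Attempt],
                            final_score: Optional[float]) -> bool:
    """Hypothetical helper: the requirement is met by passing every
    ACES, or otherwise by scoring 70% or higher on the combined final.
    The beginning-of-semester pretest is deliberately not an input,
    because it does not count toward the requirement."""
    if attempts and all(passed_single_aces(f, r) for f, r in attempts):
        return True
    return final_score is not None and final_score >= PASSING_SCORE
```

For example, a student who fails the second ACES twice would still
meet the requirement under this sketch with a final score of 71%,
but not with 69%.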

Rodger Graf, Head of the Social Science Department at Cesar
E. Chavez High School (personal communication, May 18, 2006),
noted that the ACES are not scientifically developed, nor is there
research on their validity or reliability, but the process of
development (cohort collaboration) allows for constant
refinement and adjustment of the ACES assessments and the
actual scope and sequence. Graf prefers this method over the
district paying large amounts of money for a private company to
develop a curriculum that would be difficult and expensive to
adjust.

Because of this cohort collaboration in the development of
the ACES assessments, the separate committees of teachers and
administrators that developed the ACES assessments came up
with different models of assessing. Graf (personal
communication, May 18, 2006) noted that this is why the World
History ACES are multiple-choice based (see Appendix B for
examples) and the U.S. History ACES are project-based (see
Appendix A for examples). Both forms of assessment are based on
the scope and sequence and the California Content Standards, but
the logic behind the multiple-choice ACES assessments was to be
more similar to the CST. The logic behind the project-based ACES
assessments was to reach the students at a deeper
metacognitive level, so that they would gain deeper knowledge
that would be reflected on the CST.

Graf (personal communication, May 18, 2006) stressed that
many teachers have issues with the different forms of
assessments. Some feel the project-based assessments are limited
for several reasons: the ability to conduct project-based
research is limited because the majority of students are
socioeconomically disadvantaged, the different school sites lack
resources, and the project-based assessments are not
valid because they do not assess what they are supposed to
assess. Other teachers feel that the multiple-choice assessments
allow the teachers to teach to the test and that the students only
learn at a basic cognitive level.







Graf (personal communication, May 18, 2006) noted that some
of the criticisms are alleviated because the ACES system allows
flexibility in how the assessments are given. If teachers want
to add an appendix to the test for their college prep or honors
classes, they can take that liberty. But Graf added that
passing or not passing the ACES is strictly based on the
district ACES, not on anything else a teacher may add. The
teacher also has the liberty to work the grade of the ACES into
their grading system any way they choose.

Graf (personal communication, May 18, 2006) also noted that
there are some positive outcomes of the ACES: they have forced
the teachers to teach the California Content Standards. Graf
gave this example: "Before the ACES, if a teacher of U.S.
History really liked the Civil War they would spend months on it
and would cut out important standards through the rest of the
course. The ACES have eliminated this kind of teaching."







CHAPTER III: METHODOLOGY 
Description of the Research 
This study compares the passing rate of the ACES in World
History courses, which are multiple-choice based, to the same
students' CST results in the World History section of the test.
The same comparison is made for the United States History
courses, which have project-based ACES. The percentages from
both groups are compared to determine which type of test
1) serves as the more effective indicator of a student's score
on the CST and 2) more effectively assesses the content
standards and therefore reflects general success on the
CST.

Research Design 

The research design of this study was an Ex Post Facto
descriptive study.

Selection of Subjects 

The subjects of this study were all the 10th grade world
history students and all the 11th grade U.S. history students
who took the World History or U.S. History portion of the CST
and attended one of the two comprehensive high schools, Cesar
E. Chavez High School (CCHS) or Delano High School (DHS), within
the Delano Joint Union High School District in the 2004-2005
school year. In addition to the above criteria, the students
selected had to be enrolled in either school for the majority of
the year and have taken the district ACES benchmark tests.

Data Gathering 

The data in this study was gathered from the 2004-2005
DJUHSD ACES reports and the 2005 DJUHSD CST reports with the
permission of James Hay, DJUHSD Support and Assessment Services
Director, and Bonnie Armendariz, DJUHSD IT Director.

Data Analysis 

The data collected was analyzed by comparing the percentage
results of the World History and U.S. History ACES with those
of the World History and U.S. History CST.

Limitations 

The 2004-2005 school year was the third year the ACES were
used in the DJUHSD, but the first year the district kept
records. Before this, the teachers kept their own records of
which students passed or failed. Because of the limited amount
of data available, it is impossible to do any kind of study over
the several years the ACES have been used. In addition, the
2005-2006 ACES records are available, but the CST results for
the same school year are not yet available. Therefore, this
study can currently only be conducted with the 2004-2005 data.







This study was also limited in that the ACES are only
reported as pass or fail (a score of 70% or better is considered
passing) while the CST results are broken down further
("advanced," "proficient," "basic," "below basic," and "far
below basic"). This makes it difficult to conduct
a detailed comparison between the achievement levels.
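
To compare a pass/fail result with five CST performance levels,
the study in effect collapses the CST levels into two groups at a
chosen cut point ("proficient" or above, or "basic" or above). A
minimal sketch of that collapsing step follows. The function
names are hypothetical, and any list of student levels passed to
them would be the reader's own data, not data from this study.

```python
from typing import Sequence

# CST performance levels named in the text, ordered lowest to highest.
CST_LEVELS = ["far below basic", "below basic", "basic",
              "proficient", "advanced"]


def at_or_above(level: str, cut: str) -> bool:
    """True if `level` is at or above the `cut` level."""
    return CST_LEVELS.index(level) >= CST_LEVELS.index(cut)


def percent_at_or_above(levels: Sequence[str], cut: str) -> float:
    """Percentage of students whose CST level meets the cut point."""
    if not levels:
        raise ValueError("at least one student level is required")
    hits = sum(at_or_above(level, cut) for level in levels)
    return 100.0 * hits / len(levels)
```

The resulting percentage can then be set beside an ACES passing
percentage, which is the only kind of comparison the pass/fail
reporting allows.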

Another limitation of this study was the instructional
freedom that each teacher and each department had in their
actual methodology in teaching the scope and sequence and in the
administration of the ACES themselves. CCHS is also a new campus
with limited resources compared to DHS, which has existed since
the early 1900s. Because of this, some of the project-based
assessments at CCHS were slight variations of those used
at DHS. Therefore, complete and accurate comparisons between the
two high schools in the district are somewhat limited.

Another important limitation in this study is that many
more of the teachers at DHS participated in the creation of the
ACES assessments, which gives them an instructional advantage
over the teachers at CCHS. Along these lines, the 2004-2005
school year was also the first year CCHS had a junior class and,
therefore, the first time many of the teachers at CCHS taught
using the U.S. history scope and sequence and the U.S. history
ACES assessments.







Another important limitation of this study is that the ACES
results are graded and entered into the district database by the
individual teachers. Because of this teacher control, whether a
student is recorded as passing depends on each teacher's honor
and general discretion. Also, each teacher's grading criteria or
rubrics for the project-based assessments vary.







CHAPTER IV: DATA PRESENTATION AND ANALYSIS 
Presentation of Data 

The percentages of students with a passing score (70% or
more) on the ACES in World History are as follows: DHS, 54.8%;
CCHS, 73.5%; for a district average of 59.9%. The passing
rates of the ACES in U.S. History are as follows: DHS,
52.4%; CCHS, 70.1%; for a district average of 58.7% (see also
Figure 1).

The percentages of students with a "proficient" or above
score on the World History portion of the CST are as follows:
DHS, 21%; CCHS, 18%; for a district average of 19.5%. The
percentages of students with a "proficient" or above score on
the U.S. History portion of the CST are as follows: DHS, 37%;
CCHS, 21%; for a district average of 29% (see also Figure 2
for the complete CST breakdown).

The percentages of students with a "basic" or above score
on the World History portion of the CST are as follows: DHS,
59%; CCHS, 57%; for an average of 58%. The percentages of
students with a "basic" or above score on the U.S. History
portion of the CST are as follows: DHS, 67%; CCHS, 55%; for an
average of 61% (see also Figure 2).







Figure 1

2005 DJUHSD ACES Passing Percentages

[Bar graph by campus and course. Legend: students who passed all
ACES requirements; students who did not pass all ACES
requirements.]






Figure 2

2005 DJUHSD CST Proficiency Percentages

[Bar graph by CST performance level. Legend: CCHS World History
CST; CCHS U.S. History CST; DHS World History CST; DHS U.S.
History CST; DJUHSD World History CST; DJUHSD U.S. History CST.]






Figure 3

Passed ACES to "Proficient" CST Comparisons

[Bar graph. Categories: Passed ACES: World History; Passed ACES:
U.S. History; Proficient or Above CST: World History; Proficient
or Above CST: U.S. History. Legend: CCHS; DHS; DJUHSD.]



Analysis of Results 

Comparing the two high schools, the data shows that
CCHS had 18.7% more students pass the World History ACES and
17.7% more pass the U.S. History ACES. But DHS had 3% more
students reach the "proficient" or above level on the World
History portion of the CST and 16% more reach the same level on
the U.S. History portion of the CST (see also Figure 3). These
differences, like the others in this chapter, are differences
in percentage points.

As a district, the comparison of the passing rate of the
ACES to the "proficient" or above level of the CST is as
follows: 40.4% fewer students reached the desired level on the
CST World History portion than passed the World History ACES,
and 29.7% fewer students reached the desired level ("proficient")
on the CST U.S. History portion than passed the U.S. History
ACES.
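
The district-level gaps quoted above are simple differences, in
percentage points, between the ACES passing rate and the CST
"proficient" or above rate. The short sketch below reproduces
that arithmetic from the district figures reported in the
Presentation of Data. The dictionary layout and the function name
`gap` are illustrative choices, not part of the original analysis.

```python
# District-level percentages as reported in the Presentation of Data.
aces_pass = {"World History": 59.9, "U.S. History": 58.7}
cst_proficient = {"World History": 19.5, "U.S. History": 29.0}


def gap(aces_rate: float, cst_rate: float) -> float:
    """Percentage-point gap; positive when more students passed the ACES."""
    return round(aces_rate - cst_rate, 1)


# One gap per course, keyed by course name.
proficient_gaps = {
    course: gap(aces_pass[course], cst_proficient[course])
    for course in aces_pass
}

for course, points in proficient_gaps.items():
    print(f"{course}: {points} percentage points")
```

Under these inputs the sketch gives 40.4 points for World History
and 29.7 points for U.S. History, matching the figures in the text.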

Comparison of the data for the students who reached the
"basic" level, which is considered by the CST as "approaching
state standards," brought a much different result. First, the
percentages of students at each school who reached
"basic" or above on the World History portion of the CST
were as follows: CCHS, 57%; DHS, 59%. The same
comparison for the U.S. History portion of the CST is:
CCHS, 55%; DHS, 67%. The average percentage of students, across
both high schools, who reached "basic" or above on the
World History portion of the CST is 58%, while the same
figure for the U.S. History portion of the CST is 61% (see
also Figure 4).

The differences between the percentage of students who reached
the "basic" or above level on the history portions of the CST and
the percentage of students who passed the ACES
are as follows (more students passed the ACES than met the CST
level unless otherwise noted): CCHS World History, 16.5%; DHS World
History, 4.2% (more met the CST level than passed the ACES);
DJUHSD World History, 1.9%; CCHS U.S. History, 15.1%; DHS U.S.
History, 14.6% (more met the CST level than passed the ACES);
DJUHSD U.S. History, 2.3% (more met the CST level than passed
the ACES). See also Figure 4.
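
The campus-by-campus differences listed above can be checked the
same way. The sketch below is only an illustration: the
percentages are the ones reported in this chapter, while the
dictionary keys, the function name `describe_gap`, and the
direction labels are assumptions introduced for the example.

```python
# Reported ACES passing rates and CST "basic" or above rates (percent).
aces_pass = {
    ("CCHS", "World History"): 73.5, ("DHS", "World History"): 54.8,
    ("DJUHSD", "World History"): 59.9,
    ("CCHS", "U.S. History"): 70.1, ("DHS", "U.S. History"): 52.4,
    ("DJUHSD", "U.S. History"): 58.7,
}
cst_basic = {
    ("CCHS", "World History"): 57.0, ("DHS", "World History"): 59.0,
    ("DJUHSD", "World History"): 58.0,
    ("CCHS", "U.S. History"): 55.0, ("DHS", "U.S. History"): 67.0,
    ("DJUHSD", "U.S. History"): 61.0,
}


def describe_gap(aces_rate: float, cst_rate: float):
    """Return (size of gap in percentage points, which rate was higher)."""
    diff = round(aces_rate - cst_rate, 1)
    if diff > 0:
        direction = "more passed the ACES"
    elif diff < 0:
        direction = "more met the CST level"
    else:
        direction = "equal"
    return abs(diff), direction


basic_gaps = {key: describe_gap(aces_pass[key], cst_basic[key])
              for key in aces_pass}

for (group, course), (points, direction) in basic_gaps.items():
    print(f"{group} {course}: {points} ({direction})")
```

Keeping the direction alongside the size makes it visible that the
small district gaps sit between campus gaps that run in opposite
directions.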

Summary 

Considering that the scope and sequence and the ACES are
generally the same between the two campuses, there is a notable
discrepancy in their passing rates, nearly 20% for each ACES. As
a district, there is also a large discrepancy between the ACES
passing rates and the percentage of students that reached the
"proficient" or above levels on the CST (around 35% fewer students
reached the "proficient" level on the CST than passed the ACES),
even though preparing students to reach that level
was one of the goals of the ACES.







The data does show that the percentage of students that passed
the ACES, taken as a district average across both campuses, is more
closely related (only a 1.9% to 2.3% difference) to the percentage
of students that reached the "basic" or above level on the CST.







Figure 4

Passed ACES to "Basic" CST Comparisons

[Bar graph. Categories: Passed ACES: World History; Passed ACES:
U.S. History; Basic or Above CST: World History; Basic or Above
CST: U.S. History. Legend: CCHS; DHS; DJUHSD.]






CHAPTER V: CONCLUSIONS AND RECOMMENDATIONS 

Conclusions 

In answering the question of which ACES is a better gauge
of the student's ability to reach "proficient" or above on the
CST, the research shows that (as a district) the differences
between the percentage of students passing the ACES and the
percentage reaching the desired level on the CST were as follows:
World History ACES, 40.4%; U.S. History ACES, 29.7%. Therefore,
the U.S. History project-based ACES are 10.7 percentage points
closer to predicting student success on the CST than the World
History multiple-choice based ACES.

Although the project-based ACES are closer in relation to
the corresponding CST desired results of reaching the "proficient"
level or above, the U.S. History project-based ACES passing rate
is still 29.7 percentage points off from the percentage of
students that actually obtained those levels on the CST.

What the data does verify is that the U.S. and World
History ACES, taken as a district average across both campuses,
are accurate at predicting the percentage of students that will
reach the level below "proficient," which is "basic." The data
shows that the difference is only around 2%. Therefore, the ACES
can be used successfully to gauge students' potential on the CST
at the "basic" level.

The data gathered also shows major differences in the
passing rates of both types of ACES between the two campuses, with
CCHS passing around 70% on both types of ACES and DHS passing
around 55%. From this it can be concluded that, despite
the continuity in the scope and sequence and, generally, in the
ACES themselves, there are differences in teaching methodology,
focus, and style, and in teaching to the ACES versus teaching to
the state standards, as well as possible leniency issues in
adherence to the district ACES policies.

Implications and Inferences 

This research is valuable not only to the DJUHSD and the
individual campuses of the district, but also to numerous other
school districts that are attempting to formulate effective
benchmark assessments without turning to expensive programs from
the educational assessment-making industry. Although the ACES
are unsuccessful in properly gauging student success at the
"proficient" or above level on the CST, they do
properly detect the "basic" level right below it.

To adjust the ACES to properly assess the "proficient" or
above levels, the ACES should be modified to serve more as a
formative assessment, as opposed to their current form as a
summative assessment. Research has concluded (Neill, 2006; Olson,
2005a; Popham, 2000) that for benchmark assessments to be effective
they need to be formative, which allows the teacher to evaluate
while the content is being taught and re-teach if needed. The
DJUHSD should also follow Herman and Baker's (2005) six criteria
for effective benchmark assessments.

If these adjustments are made, the ACES assessments
should engage the students properly, have the formative
functionality, and possess the rigor required to properly gauge
the student at the "proficient" or above level.

In regards to which type of benchmark assessment,
multiple-choice or project-based, is more effective in its
own right, the data in this study is rather inconclusive. But the
data does show that the U.S. History project-based ACES results
at DHS had the closest relation to the "proficient" or above CST
results out of all the types of ACES at the two campuses. The
difference between the ACES passing rate and the CST "proficient"
or above rate was only 15.4%. In comparison, the same ACES at CCHS
gave a difference of 49.1%. Why the difference?

The answer can be ascertained from the literature reviewed
above concerning the ACES. The ACES at CCHS were modified from the
originals used at DHS because CCHS is a new campus (3 years old
at the time) with limited resources. For example, a majority of
social science teachers at DHS either had computers with internet
access in their classrooms or had ample access to them.
On the other hand, CCHS had very limited access to similar
computers. Because of this, the CCHS history ACES were modified
to allow all research on those project-based ACES to be
completed through the sole use of the available textbook.

What this ACES modification created at CCHS was an
ineffective project-based assessment, going against what
research has revealed to be an effective project-based or
authentic assessment (Bartscher et al., 1995; Ferretti et al.,
2001; Linn & Miller, 2005; Mehta & Kou, 2005; Ormrod, 2006;
Railsback, 2002; Roberts & Harlin, 2005).

The discrepancy in the passing percentages
between the two campuses' ACES results, after looking at the
data and the literature, was most likely the result of teachers
at CCHS formulating their instruction to the content of the
benchmark assessments themselves, or in other words, teaching to
the test. Research has shown that teaching to the test does,
indeed, work for a particular assessment but does not work
across different assessments, even though the content is similar
(Neill, 2006; Olson, 2005a). This is also supported by the
fact that DHS had nearly 20% fewer students pass the ACES than
CCHS, but had 3% to 16% more students score "proficient" or
above on the CST. It could be concluded that DHS did not align
its instruction strictly to the ACES, but more toward the
state content standards.




Recommendations for Further Study 
To get accurate results in comparing the multiple-choice
and the project-based ACES with the CST results, a study would
need to be conducted over several years with statistical data.
Along this line of formulating statistical accuracy, there would
also need to be better criteria and relative consistency and
accuracy established in the development of the ACES themselves
between the two campuses (and by 2008 the third campus, Robert
F. Kennedy High School).

Another recommendation would be to conduct a survey of the
teachers at DHS and CCHS to determine how they implement the
scope and sequence, how they prepare the students for their
respective ACES, how they prepare the students for the CST, and
what their grading/evaluation procedures for the ACES are. This
survey would be beneficial in analyzing the differences in passing
rates between teachers and between campuses within the district.







References 

Amabile, T. (1979). Effects of external evaluation on artistic
creativity. Journal of Personality and Social Psychology,
37(2), 221-233.

Armstrong, T. (1998). Awakening Genius in the Classroom.
Alexandria, VA: Association for Supervision and Curriculum
Development.

Bartscher, K., Gould, B., & Nutter, S. (1995). Increasing
student motivation through project-based learning.
Unpublished masters thesis. Saint Xavier University.

Bell, R. C., & Hay, J. A. (1987). Differences and biases in
English language examination formats. British Journal of
Educational Measurement, 28, 77-92.

Bolger, N., & Kellaghan, T. (1990). Methods of measurement and
gender differences in scholastic achievement. Journal of
Educational Measurement, 27, 165-174.

Bridgeman, B. (1992). A comparison of quantitative questions in
open-ended and multiple-choice formats. Journal of
Educational Measurement, 29, 253-271.

Burstein, L. (1994). Performance-Based Assessment for
Accountability Purposes: Taking the Plunge and Assessing
the Consequences. Los Angeles, CA: National Center for
Research on Evaluation, Standards, and Student Testing.







Burton, R. F. (2005). Multiple-choice and true/false tests:
Myths and misapprehensions. Assessment & Evaluation in
Higher Education, 30(1), 65-72.

Camel, C., & Chung, T. (2002). Circumventing the pressures of
standardized norm-referenced tests. Unpublished masters
thesis. Saint Xavier University.

Carey, J. (1997). Everyone knows that E=MC² now, who can explain
it? Business Week, 3547, 66-68.

Carnoy, M., & Loeb, S. (2002). Does external accountability
affect student outcomes? A cross-state analysis.
Educational Evaluation and Policy Analysis, 24, 305-331.

Clarke, M. M., Madaus, G. F., Horn, C. L., & Ramos, M. A.
(2000). Retrospective on educational testing and assessment
in the 20th century. Journal of Curriculum Studies, 32(2),
159-181.

Darling-Hammond, L. (1991). The implications of testing policy
for quality and equality. Phi Delta Kappan, 73, 220-225.

Darling-Hammond, L. (1994). Performance-based assessment and
educational equity. Harvard Educational Review, 64(1), 5-30.

Ferretti, R. P., Macarthur, C. D., & Okolo, C. M. (2001).
Teaching historical understanding in inclusive classrooms.
Learning Disability Quarterly, 24, 59-71.







Furger, R. (2002). Assessment for understanding. Retrieved June
5, 2006, from The George Lucas Educational Foundation Web
Site: http://www.edutopia.org/php/article.php?id=Art_937&key=005

Grant, M. M. (2002). Getting a grip on project-based learning:
Theory, cases, and recommendations. Meridian, 5(1), 1-17.

Herman, J. L., & Baker, E. L. (2005). Making benchmark testing
work. Educational Leadership, November, 48-54.

Koretz, D. (2005). Alignment, High Stakes, and Inflation of Test
Scores. Los Angeles, CA: National Center for Research on
Evaluation, Standards, and Student Testing.

Krippner, S. (1967). The Ten Commandments that block creativity.
Gifted Child Quarterly, 11(2), 144-156.

Lester, F. K., Lambdin, D. V., & Preston, R. V. (1997). A new
vision of the nature and purposes of assessment in the
mathematics classroom. In G. D. Phye (Ed.), Handbook of
Classroom Assessment: Learning, Achievement, and
Adjustment. San Diego, CA: Academic Press.

Linn, R. L., & Miller, M. D. (2005). Measurement and Assessment
in Teaching (9th ed.). Upper Saddle River, NJ: Pearson
Merrill Prentice Hall.







Louis, K. S., Febey, K., & Schroeder, R. (2005). State-mandated
accountability in high schools: Teachers' interpretations
of a new era. Educational Evaluation and Policy Analysis,
27(2), 177-204.

Lukhele, R., Thissen, D., & Wainer, H. (1994). On the relative
value of multiple-choice, constructed response, and
examinee-selected items on two achievement tests. Journal
of Educational Measurement, 31(3), 234-250.

Lumsden, K. G., & Scott, A. (1987). The economics student
reexamined: Male-female differences in comprehension.
Journal of Economic Education, 18(4), 365-375.

Madaus, G. F. (1993). A national testing system: Manna from
above? A historical/technological perspective. Educational
Assessment, 1(1), 9-26.

Martinez, M. E. (1999). Cognition and the question of test items
format. Educational Psychologist, 34(4), 207-218.

McElroy, E. J. (2006). It's time to get smart about teaching.
American Teacher, 90(2), 2.

McNeil, L. (2000). Contradictions of School Reform: Educational
Costs of Standardized Testing. New York: Routledge/Falmer.







Mehta, S., & Kou, Z. (2005). Research in statics: Do active,
collaborative, and project-based learning methods enhance
student engagement, understanding, and passing rate? Paper
presented at the meeting of the American Society for
Engineering Education Annual Conference and Exposition.
Chicago, IL.

Melograno, V. J. (1994). Portfolio assessment: Documenting
authentic student learning. Journal of Physical Education,
65(8), 50-61.

Neill, M. (2006). Preparing teachers to beat the agonies of
NCLB. The Education Digest, 71(8), 8-12.

Olson, L. (2005a, November 30). Benchmark assessments offer
regular checkups on student achievement. Education Week,
25, 13.

Olson, L. (2005b, November 30). Not all teachers keen on periodic
tests. Education Week, 25, 13.

Ormrod, J. E. (2006). Educational Psychology: Developing
Learners (5th ed.). Upper Saddle River, NJ: Pearson Merrill
Prentice Hall.

O'Shea, M. R. (2006). Beyond compliance: Steps to achieving
standards. Principal Leadership, 6(8), 28-31.







Paris, S. G., & Paris, A. H. (2001). Classroom applications of
research of self-regulated learning. Educational
Psychologist, 36, 89-101.

Popham, W. J. (1993, February). Circumventing the high costs of
authentic assessment. Phi Delta Kappan, 74(6), 470-474.

Popham, W. J. (2000). Assessing mastery of wish-list content
standards. NASSP Bulletin, 84(620), 30-36.

Popham, W. J. (2004, November). "Teaching to the test" an
expression to eliminate. Educational Leadership, 82-83.

Porter, A. (2000). Doing high stakes assessment right. School
Administrator, 57(11), 28-31.

Railsback, J. (2002). Project-Based Instruction: Creating
Excitement for Learning. Portland, OR: Northwest Regional
Educational Laboratory.

Roberts, T. G., & Harlin, J. F. (2005, March/April). Evaluating
"doing to learn" activities: Using performance-based
assessments. The Agricultural Education Magazine, 77(5),
27-28.

Rothman, R., Slattery, J. B., Vranek, J. L., & Resnick, L. B.
(2002). Benchmarking and Alignment of Standards and
Testing. Los Angeles, CA: National Center for Research on
Evaluation, Standards, and Student Testing.







Sheldon, K. M., & Biddle, B. J. (1998). Standards,
accountability, and school reform: Perils and pitfalls.
Teachers College Record, 100, 164-180.

Shepard, L. A. (1989, April). Why we need better assessments.
Educational Leadership, 46(1), 4-9.

Shepard, L. A. (1995, February). Using assessment to improve
learning. Educational Leadership, 52(5), 38-43.

Simkin, M. S., & Kuechler, W. L. (2005). Multiple-choice tests
and student understanding: What is the connection? Decision
Sciences Journal of Innovative Education, 3(1), 73-97.

Stecher, B. M., & Hamilton, L. S. (2002). Putting theory to the
test: Systems of educational accountability should be held
accountable. Rand Review, 26(1), 16-23.

Stiggins, R. J. (1991). Facing the challenges of a new era of
educational assessment. Applied Measurement in Education,
4(4), 263-273.

Sunderman, G. L. (2006). The Unraveling of No Child Left Behind:
How Negotiated Changes Transform the Law. Cambridge, MA:
The Civil Rights Project at Harvard University.







Swanson, A. D., & Collins, J. L. (1999). Benchmarking: A study
of school and school district effect and efficiency.
Unpublished doctoral dissertation. State University of New
York at Buffalo. Buffalo, NY: Graduate School of Education
Publications.

Tanner, D. E. (2001). Authentic assessment: A solution, or part 
of the problem? High School Journal, 85(1), 24-29. 

Traub, R. E., & MacRury, K. (1990). Multiple-choice versus free- 
response in testing of scholastic achievement. Test and 
Trends, 8, 128-159. 

Truckman, B. W. (1993). The essay test: A look at the advantages 
and disadvantages. NASSP Bulletin, 77(555), 20-26.

Valencia, S. W., Hiebert, E. H., & Afflerbach, P. P. (1994).
Realizing the possibilities of authentic assessment:
Current trends and future issues. In S. W. Valencia, E. H.
Hiebert, & P. P. Afflerbach (Eds.), Authentic Reading
Assessment: Practices and Possibilities. Newark, DE:
International Reading Association.

Vogler, K. E. (2005). Impact of a high school graduation
examination on social science teachers' instruction.
Journal of Social Science Research, 29(2), 19-33.







Appendix A 
Project-Based ACES 

U.S. HISTORY ACES #1: INFORMATION BOARD RESEARCH PROJECT 

Name    Period    Date    Teacher

California State Standards Assessed: 



11.1.1 Describe the Enlightenment and the rise of democratic ideas as the context in which the nation was founded. 

11.1.2 Analyze the ideological origins of the American Revolution, the Founding Fathers' philosophy of divinely bestowed
unalienable natural rights, the debates on the drafting and ratification of the Constitution, and the addition of the Bill of 
Rights. 

11.1.3 Understand the history of the Constitution after 1787 with emphasis on federal versus state authority and growing
democratization. 

Directions: 



1. Explain the historical background and give a synopsis of the following three documents: the
Declaration of Independence, the U.S. Constitution, and the Bill of Rights (a half page for each 
document). 

2. Create a time-line with a minimum of 15 events from the list below. 

3. Pick at least five of these events and relate them to the United States Constitution and/or the Bill of 
Rights. 

4. Prepare to present your information to the class in an oral presentation. 

Grading Guidelines: 



Historical Accuracy               30 points
Knowledge of historical content   30 points
Time Line                         15 points
Creativity                        10 points
Maintain Eye Contact               5 points
Use of Visual Aids                 5 points
Speaking and Delivery Style        5 points

Your overall evaluation must be 70% or greater to pass this ACES.



Events in US History: 

*All events are in the textbook Americans.



The Enlightenment (p. 34) 

Great Awakening (p. 35) 

Common Sense (p. 52) 

Federalists (p. 69) 

Anti-Federalists (p. 69) 
Democrat-Republicans (p. 76) 

George Washington (the 1st President) (p. 74) 
Alexander Hamilton (p. 75) 

Thomas Jefferson (p. 112)

The XYZ affair (p. 78) 

The Alien and Sedition Act (p. 78) 

Marbury v. Madison (p. 113 & 118) 



The Louisiana Purchase (p. 114) 

The War of 1812 (p. 114)

Monroe Doctrine (p. 117) 

The Missouri Compromise (p. 122) 

Jacksonian Democracy (p. 122) 

The Nullification Crisis (p. 124 & 128) 

Indian Removal (The Trail of Tears) (p. 124) 

Texas Independence (the War with Mexico) (p. 133) 
Commonwealth v. Hunt (p. 143) 

Abolitionism (anti-slavery) (p. 144) 

Seneca Falls (suffrage) (p. 147) 






Appendix A continued 



ACES #3: Industrial Times Newspaper 

California State Standard 11.2: Students analyze the relationship among the rise of industrialization, large-scale 
rural-to-urban migration, and massive immigration from Southern and Eastern Europe (1865 to 1920). 

Directions: 



Take one sheet of blank unlined paper, turn it on its side, and fold it sideways in half. This will create four pages that
you will fill in to make your newspaper. (Fold paper at dotted line)



Grading Outline: 






Completeness of Project    40 points
Creativity of Project      20 points
Historical Accuracy        20 points
Neatness of Project        20 points
Total points possible:    100 points



Page 1 

(1) Make a title for your newspaper 

(2) What were the industries that grew during the United States Industrial Revolution? (Use Ch. 6, Sections 1, 2, & 3)

(a) Pick one of the four major industries of the industrial revolution (steel, railroads, oil, or banking)

(b) Describe what that industry does and who controlled that industry (Carnegie, Vanderbilt, Rockefeller, or Morgan)

(c) Show what that person did to make the company grow (trusts on p. 243, horizontal and vertical integration on p. 242)



Page 2 

(1) How did the United States urban populations change during the Industrial Revolution?
(Chapter 7, Sections 1 & 2)

(a) Make a map or pie chart that details effects of immigration on the U.S. (p. 255, 263)

(b) Draw a political cartoon that shows how immigrants were viewed by native-born Americans (nativism, p. 258-259)



Page 3 

(1) Who were the Populists and the Progressives and what did they do to help the workers? 

(a) Draw a chart showing the changes wanted by the Populist Party and what were the results (Chapter 5, Section 3)

(b) Draw a chart with the left side showing the four goals of the Progressive movement and the right side
showing the names of the people (or groups) who fought for the change (Chapter 8, Section 2 and
Chapter 9, Section 1, 2, 3, 4)



Page 4 

(1) What were the differences between the industrialists and the workers? (Chapter 6, Section 3 and Chapter 7, 
Section 2) 

(a) Write one letter to the editor from a business leader (Carnegie, Rockefeller, Morgan, or Vanderbilt)
that explains why they were rich (Social Darwinism, p. 242)

(b) Letter two is from an average worker that explains the living and working conditions of the poor (p. 244-245)






Appendix A Continued 



ACES #5: Jazz Age Newspaper 

California State Standard 11.5 

Students analyze the major political, social, economic, technological, and cultural developments of the 1920s. 
Directions:

Use four sheets of blank unlined paper. Each sheet will be one page of the newspaper, as follows:

Page 1 

(1) Make a title for your newspaper 

(2) What was the Harlem Renaissance? 

(Use Chapter 13, Section 4) 

(a) A map of the “Great Migration” 

(b) Describe the different types of artistic expression, including the names of at least three artists.

(c) Include a poem from Langston Hughes (page 459) 

Page 2 

(1) How did the 18th Amendment affect the United States?

(Chapter 13, Section 1) 

(b) Draw a political cartoon that shows an aspect of Prohibition (bootlegging, speakeasies, 
organized crime, or law enforcement) 

Page 3 

(1) What were the changes in everyday life of the people of the United States? 

(Chapter 13, Section 3) 

(a) Write a sports report about a popular sport during the 1920’s

(b) Write an advertisement for a popular product (car, radio, etc.) of the 1920’s 

Page 4 

(1) What are the major issues in people’s lives? 

(a) Write one letter to the editor from a flapper who talks about her new freedoms. (Chapter 13, 
Section 2) 

(b) Write a second letter from A. Mitchell Palmer explaining how the communists are trying to 
destroy America. (Chapter 12, Section 1) 



Grading Criteria:

Historical Accuracy/Completeness of Project    60 pts
Artistic Creativity                            20 pts
Neatness                                       20 pts
Total                                         100 pts






Appendix B 

Multiple-Choice Based ACES 



Revised 1/25/2005



DO NOT WRITE ON THIS TEST 

GRADE 10 

World History/Geography
ACES #5 

CLUSTER 3: CAUSES, COURSE AND EFFECTS OF THE FIRST WORLD WAR 

Multiple Choice 

(Standard 10.5: Causes and Course of the First World War)

1. Whose assassination was the “spark” that started World War I?

A. President Woodrow Wilson 

B. Archduke Franz Ferdinand 

C. Czar Nicholas I 

D. Kaiser Wilhelm I 

2. All of the following added to the rivalry between many European nations prior to World War I, 
EXCEPT: 

A. Nationalism 

B. Communism 

C. Militarism 

D. Imperialism 

3. Some European nations formed two separate groups called the Triple Entente and the Triple 
Alliance for the purpose of: 

A. putting all their money together to buy colonies in Asia 

B. sharing resources and technology 

C. having the help of another nation in case of a war 

D. spreading democracy to nations that still had monarchies 

4. How did the Ottomans respond to the demand by Armenians for their own independent nation? 

A. The Ottomans gave them a large portion of their territory and wished them luck 

B. The Ottomans murdered thousands of Armenians and deported thousands more 

C. The Ottomans tried to convince the Armenians that independence was not very important 

D. The Ottomans agreed to sell some of their land to the Armenians 

5. How did the Alliance System result in World War I?

A. Several nations entered the war because they had to keep their promises to support one another

B. Every nation wanted to see Germany get defeated 

C. Nations that did not have any military agreements were forced to fight the war or else lose land 

D. Countries were not satisfied with their current military partners and wanted new allies 

6. Glorifying military power and keeping an army prepared for war is known as: 

A. Nationalism 

B. Militarism 

C. Imperialism 

D. Communism 






Appendix B continued 






7. World War I was a “total war” because 

A. regular civilians suffered. 

B. countries from all over the world were involved. 

C. nations at war devoted all their resources to fight it. 

D. soldiers were trained to use many types of weapons 

8. During World War I propaganda was used to

A. influence people’s beliefs and opinions 

B. quiet the press. 

C. inform the enemy 

D. strengthen democracy 

9. Which statement is the best example of propaganda that might have been used during World War I? 

A. “It is your duty to help protect our great nation against the evil enemy who wish to destroy us” 

B. “The war is none of your business, therefore no news about it will be printed in the newspapers” 

C. “The enemy is much stronger and smarter than we are, therefore we must surrender at once” 

D. “Vote for the politician who you think has the best idea on how to win this war” 

10. What was the system of rationing designed to limit? 

A. How much information about the war could be printed in newspapers 

B. The amount of supplies people could buy in order for the military to have what it needed 

C. The length of time men had to be away from home while they were in the army 

D. Age requirements for men and women who wished to serve in the military 

11. One soldier who fought in the war wrote: “In a few minutes the first wave of enemy
soldiers were wiped out. But wave upon wave kept attacking us. As they got tangled in the barbed
wire, we used our rifles to kill them off one by one. Most, however, never got as far as the wire.
They lay dead in no man’s land as shells exploded among them and bullets tore through the deadly
air.” What was this soldier describing?

A. The successful battle plans made by enemy generals 

B. Trench warfare on the Western Front 

C. Why the British Army was better than the German Army 

D. How easy it was for the enemy to capture more land 

12. How did the fighting on the Eastern Front help cause the stalemate on the Western Front? 

A. Russia was able to give the Germans the extra supplies they needed to fight the Allies

B. The German army used the Eastern Front as a military base to train more soldiers 

C. The German army had to split its forces to fight battles in two separate regions 

D. The Allies used the Eastern Front to confuse the German army 

13. Why did the combat on the Western Front in World War I take place in a relatively small area? 

A. There is only a small amount of flat land in all of Europe 

B. The armies became immobile because of trench warfare 

C. Each side cut off the fuel supply of the other 

D. Germany’s military tactics were based on “static warfare” 






Appendix B continued 






14. How did the Communist Revolution affect Russia’s participation in World War I?

A. Russia signed a peace treaty with Germany and stopped fighting in the war 

B. Russia became more dedicated than ever to help the Allies win the war 

C. Russia left the Allied Powers and joined the Central Powers 

D. Russia began to win more battles and was able to defeat the German army 

15. Why did German submarines sink American ships traveling to France and Britain? 

A. Germany was hoping the United States would join the Allied side and fight in the war 

B. The United States was selling the Allies the supplies they needed to fight against Germany

C. The Germans were trying to prevent the United States from being tricked by the Allies 

D. The Germans hoped to increase the amount of supplies being given to the Allies 

16. How did the entry of the United States on the side of the Allies affect (change) the course 
of World War I? 

A. American soldiers, who were fresh and eager to fight, helped defeat the German army 

B. Nothing changed, the war lasted for ten more years and millions of men still died 

C. The Allies became weaker and had to surrender to the much stronger German army 

D. The Allies gave up and let the Americans do all the fighting 

17. What was the result of the armistice that was signed in 1918? 

A. Both sides agreed to continue the war to the bitter end 

B. Both sides agreed to stop fighting 

C. Both sides agreed to stop using poison gas on each other 

D. Both sides agreed to fight one more battle 

(Standard 10.6.1: Post WWI Peace Efforts)

18. President Wilson said that his Fourteen Points would provide a framework for 

A. a lasting and just peace 

B. determining war reparations 

C. expanding colonial empires 

D. punishing aggressor nations 

19. A major goal of France and Great Britain at the Conference of Versailles following World War I 
was to 

A. create a politically unified Europe 

B. keep Germany from rebuilding its military 

C. restore pre-war imperial government to power 

D. help Germany rebuild its industrial economy 

20. Who was required to take responsibility for the war? 

A. All the nations that fought in the war 

B. The nations that lost the most soldiers 

C. Germany had to accept blame for the war 

D. All those nations that were on the losing side 






Appendix B continued 






21. The League of Nations was an international association whose goal was 

A. to keep peace 

B. feed children 

C. discover new medicines 

D. fight injustice 

22. Which statement accurately summarizes the human cost of World War I? 

A. Fewer people died in World War I than in any war in history 

B. The losing countries were the only ones who lost very many soldiers

C. The only people who died during this war were the soldiers who were fighting 

D. Millions of people died or were wounded as a result of this war 

23. In 1919, a person called the Versailles Treaty “a peace built on quicksand.” What was that person 
saying about the Versailles Treaty? 

A. The peace treaty was very well put together and peace should last forever 

B. The peace treaty had many flaws and peace would not last very long 

C. The peace treaty satisfied all nations because they got land near the ocean 

D. The peace treaty settled the most serious problems that had caused the war 

24. Why did the United States refuse to join the League of Nations? 

A. The U.S. wanted to continue to fight the war against the Germans 

B. The U.S. was not invited to join the League of Nations 

C. The U.S. thought its best hope for peace was to stay out of European affairs 

D. The U.S. wanted more territory than the League of Nations was planning to give them 






Appendix B continued 



Do Not Write On This Test 



GRADE 10 
WORLD HISTORY 
ACES #6 

CLUSTER 4: CAUSES AND EFFECTS OF THE SECOND WORLD WAR 

Multiple Choice 

(STANDARD 10.7.3: Analyze the rise, aggression, and human costs of totalitarian regimes in 
Germany, Italy, and the Soviet Union) 

1. In the years following World War I, why were some of the newly created democratic governments in
Europe unpopular with their people?

A. Democratic governments gave people more rights than they knew what to do with 

B. These governments lacked the experience and the effectiveness to deal with problems 

C. People wanted to return to the old practice of having Absolute Monarchies 

D. The leaders of these governments were assigned to them by the Allies, not chosen through elections 

2. All of these people are considered to have been dictators except: 

A. Joseph Stalin 

B. Adolf Hitler 

C. Franklin Roosevelt 

D. Benito Mussolini 

3. What was the name of the time period of economic problems that affected the entire world?

A. National Poverty 

B. Economic Collapse 

C. Business Reformation 

D. Great Depression 

4. What does the term “totalitarianism” mean?

A. Government control over every aspect of public and private life 

B. Government that attempts to give people total freedom 

C. Government that controls what religion its citizens have 

D. Government that is controlled by the people by the use of their vote 

5. Which is something a leader of a totalitarian government would do to increase his power?

A. Make sure all the citizens are registered to vote 

B. Change laws in order to give people more freedom 

C. Arrest any person who does not belong to the Church 

D. Use censorship and propaganda to control public opinion 

6. What would a dictator use his “secret police” for? 

A. To investigate crimes against innocent civilians 

B. To arrest those who disagreed with the dictator and force others to be obedient 

C. To spy on the armies of enemy nations to see what they are planning 

D. To protect the leaders of other political groups so they are safe from terrorists 

7. What would a Fascist demand his people do? 

A. Be loyal to the nation and to him (the leader) 

B. Obey one’s parents and grandparents 

C. Remember wars are terrible and solve nothing

D. Refuse to pay taxes or go to school 






Appendix B continued 



8. Fascists in Germany and Italy gained popularity by promising all of the following except:

A. Economic recovery 

B. Restore past glory 

C. Give more rights 

D. Protection from Communism 

9. Which nation replaced its Monarchy with a Communist Totalitarian Regime? 

A. Japan 

B. Soviet Union 

C. Great Britain 

D. France 

10. What was the name of the democratic government of Germany after World War I? 

A. Czarist Regime 

B. Kaiser Parliament 

C. Congress of Berlin 

D. Weimar Republic 

11. All of the following are reasons why Germany was angry towards the Allied nations and the
Treaty of Versailles except:

A. Germany got more land than they could afford to take care of

B. Germany was humiliated by being blamed for World War I

C. Germany had to pay the Allies millions of dollars for reparations 

D. Germany was forced to give up some of its territory 

12. The title of Hitler’s book, Mein Kampf, in English means:

A. “My Struggle” 

B. “My Country” 

C. “Master Race” 

D. “Revenge” 

13. Hitler’s main method of getting lebensraum was to: 

A. Attack the liberals 

B. Conquer other countries 

C. Form a secret police force 

D. Give people more rights 

14. What would have been the consequence for disobeying or disagreeing with a dictator? 

A. A person would be forced to pay a fine and do community service 

B. A person would either be sent to prison or executed 

C. A person would be forced to join the army for at least four years 

D. A person would be given a lawyer and forced to testify before a judge 



15. What does the term indoctrination mean? 

A. Teaching others a specific set of beliefs and values 

B. Giving unemployed workers new jobs and benefits 

C. Forcing people to surrender their citizenship and leave a country 

D. Respecting people’s thoughts and point of view 






Appendix B continued 



Read the passage below and answer questions 16 through 20 that follow the reading. 



The Oath to Adolf Hitler 

Speech by Rudolf Hess on 25 February 1934 

Background: On 25 February 1934, about a million Nazi party officials gathered at points around Germany 
to swear an oath to Adolf Hitler. This is an excerpt from the speech Rudolf Hess gave on the occasion, which 
was broadcast to the nation. 

The source: "Der Eid auf Adolf Hitler,” Rudolf Hess, Reden (Munich: Zentralverlag der NSDAP, 1938), pp. 
10-14. 



German men, German women, German boys, German girls, over a million of you are gathered in many 
places in all of Germany! 

On this the anniversary of the proclamation of the Party's program, you will together swear an oath of loyalty 
and obedience to Adolf Hitler. You will display to the world what has long been obvious to you, and what 
you have expressed in past years, often unconsciously. You are swearing your oath on a holiday that Germany
celebrates for the first time: Heroes' Memorial Day. We lower our flags in remembrance of those who lived 
as heroes, and who died as heroes. We lower the flags before the giants of our past, before those who fought 
for Germany, before the millions who fought in the World War, before those who died preparing the way for 
the new Reich. 

Woe to the people that fails to honor its heroes! It will cease producing them, cease knowing them. Heroes 
spring from the essence of their people. A people without heroes is a people without leaders, for only a heroic 
leader is a true leader able to withstand the challenge of difficult times. The rise or fall of a people can be 
determined by the presence or absence of a leader. 

The battle-ready manly heroes and the quiet sacrifices of mothers and women are holy examples of loyalty 
for us Germans. The flags that we now raise once more are the symbols of this loyalty, which for Nordic 
mankind is closely bound to heroism! 

Hitler Youths, you have given the same absolute loyalty to the Führer that Germany's young volunteers gave
twenty years ago at Langemarck, which demanded their heroic deaths for our people and the Reich. You have
the good fortune to live in a Reich that the best warriors of 1914 could only dream of, a Reich that for all
eternity will remain united if you do your duty. For you, doing your duty means: Obey the Führer's orders
without question!

I say to the political leaders what I said to your comrades in Gau Thuringia as they were sworn in last year: 

Be true to Hitler's spirit! Ask in all that you do: What would the Führer do? If you act accordingly, you will
not go wrong! Being true to Hitler's spirit means always being a model. "To be a leader is to be an
example," just as Hitler and his work are an example for you. It means that no matter what, always to be a
servant of the total National Socialism of Adolf Hitler, to be a fully conscious, heartfelt follower of the Führer
above all else.

Be ever aware that, wherever you are, you owe thanks to the Führer, for his leadership enabled every victory.
Wherever you are, be it high or low, work for his movement, and therefore for Germany. Remember what 
Adolf Hitler says: it makes no difference if one is a street cleaner or a professor, as long as he works for the 
whole and does his duty. The reward for your labors is the feeling of having done one's duty for the 
movement, for Adolf Hitler, for Germany. Each of you is as unique in history as National Socialism itself. 

Your oath is not a mere formality; you do not swear this oath to someone unknown to you. You do not swear 
in hope, but with certainty. Fate has made it easy for you to take this oath without condition or reservation. 






Appendix B continued 



Never in history has a people taken an oath to a leader with such absolute confidence as the German people 
have in Adolf Hitler. You have the enormous joy of taking an oath to a man who is the embodiment of a 
leader. You take an oath to the fighter who demonstrated his leadership over a decade, who always acts 
correctly and who always chose the right way, even when at times the larger part of his movement failed to 
understand why. 

You take an oath to a man whom you know follows the laws of providence, which he obeys independently of 
the influence of earthly powers, who leads the German people rightly, and who will guide Germany's fate. 
Through your oath you bind yourselves to a man who — that is our faith — was sent to us by higher powers. 
Do not seek Adolf Hitler with your mind. You will find him through the strength of your hearts! Adolf Hitler
is Germany and Germany is Adolf Hitler. He who takes an oath to Hitler takes an oath to Germany! 

Inferring Main Ideas 

16. What do you think was the main idea of this passage? 

A. The Nazi government was interested in Germany getting colonies in Asia. 

B. German men, women and children were expected to be loyal to Germany and Hitler. 

C. Explaining the reasons why World War I and World War II happened in Europe. 

D. The importance of citizens to respect the religion and rights of other people. 

Recognizing Facts and Details 

17. Who was about to take the oath to Hitler? 

A. German men, women and children 

B. Men, women and children from all over the world 

C. Men, women, and children who were afraid of Hitler 

D. German veterans who had been wounded during World War I 

Making Inferences 

18. From this reading you can infer (conclude): 

A. World War I had not been fought when this speech was given 

B. Very few people knew who Hitler was when they swore loyalty to him

C. Hitler was already the leader of Germany when this oath was taken 

D. Rudolf Hess (the speaker) did not think Hitler was worthy of respect 

Drawing Conclusions 

19. Which conclusion can you draw from this reading?

A. People from France were proud to make this oath to Hitler 

B. Hitler wanted the German people to be obedient and loyal to him 

C. This speech made it clear that Hitler did not like the Jews 

D. This was a secret oath that only the Nazis were allowed to know about 

Distinguishing Fact from Fiction 

20. Which statement is an opinion? 

A. About a million Germans were about to take an oath to Hitler 

B. This speech was given in 1934 

C. Most Germans made this oath because they were afraid of Hitler 

D. The German people were told to have faith and confidence in Hitler 



